fix: optimize email search and require filters to prevent timeouts #117
Acid-Override wants to merge 15 commits into ai-zerolab:main
Add two new MCP tools for email management:
- mark_emails_as_read: Mark emails as read/unread using IMAP \Seen flag
- move_emails: Move emails between mailboxes using MOVE (RFC 6851) with fallback to COPY+DELETE for older servers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
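The MOVE-with-fallback behavior described in this commit can be sketched roughly as follows. This is not the PR's code: `UidClient` is a hypothetical minimal interface whose `uid()` returns a `(status, data)` pair in the style of Python's imaplib, and whether a particular client library accepts `MOVE` through such a method is an assumption.

```python
from typing import Protocol, Sequence


class UidClient(Protocol):
    """Hypothetical client shape: uid() returns (status, data)."""

    capabilities: Sequence[str]

    def uid(self, command: str, *args: str) -> tuple: ...

    def expunge(self) -> tuple: ...


def mark_emails_as_read(client: UidClient, uids: Sequence[str], read: bool = True) -> None:
    """Add or remove the \\Seen flag on the given UIDs."""
    action = "+FLAGS" if read else "-FLAGS"
    client.uid("STORE", ",".join(uids), action, "(\\Seen)")


def move_emails(client: UidClient, uids: Sequence[str], destination: str) -> str:
    """Move messages by UID and report which strategy was used."""
    uid_set = ",".join(uids)
    if "MOVE" in client.capabilities:
        status, _ = client.uid("MOVE", uid_set, destination)
        if status == "OK":
            return "MOVE"
    # Fallback for servers without RFC 6851: copy, flag as deleted, expunge.
    status, _ = client.uid("COPY", uid_set, destination)
    if status != "OK":
        raise RuntimeError(f"COPY to {destination!r} failed")
    client.uid("STORE", uid_set, "+FLAGS", "(\\Deleted)")
    client.expunge()
    return "COPY+DELETE"
```

One caveat with the fallback path: a plain EXPUNGE removes every message flagged \Deleted in the selected mailbox, not only the ones just copied.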
Fix bug where status messages like "SEARCH completed (took 5 ms)" were incorrectly parsed as email UIDs. The number in the timing info (e.g., "5") was being treated as a valid UID.
Add _parse_search_response() method that:
- Detects status messages by checking for keywords
- Returns empty list for status-only responses
- Only returns actual numeric UIDs from valid responses
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
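A minimal sketch of the parsing rule described above, assuming the response arrives as a list of byte strings. The keyword list is hypothetical (the commit only says status lines are detected "by checking for keywords"), and the real `_parse_search_response()` may differ.

```python
# Hypothetical keywords that mark a line as a status message, not UID data.
STATUS_KEYWORDS = ("completed", "took", "success")


def parse_search_response(lines: list) -> list:
    """Return numeric UIDs as strings, skipping status-only lines."""
    uids = []
    for raw in lines:
        text = raw.decode("ascii", errors="ignore").strip()
        lowered = text.lower()
        if any(keyword in lowered for keyword in STATUS_KEYWORDS):
            # e.g. "SEARCH completed (took 5 ms)": the "5" is not a UID.
            continue
        uids.extend(token for token in text.split() if token.isdigit())
    return uids
```

Skipping the whole line, rather than filtering individual tokens, is what keeps the "5" in the timing info from leaking into the result.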
Add new MCP tool to list all mailboxes (folders) in an email account. Returns mailbox name, flags, and hierarchy delimiter.
Useful for discovering folder names like Archive, Sent, Trash which may vary across email providers (e.g., iCloud uses different names).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
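To illustrate how name, flags, and hierarchy delimiter can be pulled out of an IMAP LIST response line, here is a rough sketch. `Mailbox` and `parse_list_line` are hypothetical names, not the PR's API, and the sketch assumes simple quoted or atom mailbox names (no literals, no modified UTF-7 decoding).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches: (flags) "delimiter"|NIL name
_LIST_LINE = re.compile(
    r'\((?P<flags>[^)]*)\)\s+(?P<delim>"(?:[^"\\]|\\.)*"|NIL)\s+(?P<name>.+)$'
)


@dataclass
class Mailbox:
    name: str
    flags: list
    delimiter: Optional[str]


def parse_list_line(line: bytes) -> Optional[Mailbox]:
    """Parse one LIST response line; return None if it does not match."""
    match = _LIST_LINE.search(line.decode("utf-8", errors="replace").strip())
    if match is None:
        return None
    name = match["name"].strip()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        name = name[1:-1]  # strip surrounding quotes
    delim = match["delim"]
    return Mailbox(
        name=name,
        flags=match["flags"].split(),
        delimiter=None if delim == "NIL" else delim[1:-1],
    )
```

Flags such as `\Sent` or `\Trash`, where a server reports them, tend to be a more reliable way to find special folders than matching on provider-specific names.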
Performance optimization:
- Don't fetch INTERNALDATE for all emails when paginating
- Use UID ordering directly (UIDs are ascending by add date)
- Only fetch headers for the requested page
New search_emails tool:
- Server-side IMAP SEARCH (fast even with thousands of emails)
- Search in: all (TEXT), subject, body, or from
- Paginated results
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
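The pagination idea in this commit (order by UID, then fetch headers only for the requested page) can be sketched like this. `page_of_uids` is a hypothetical helper, not a function from the PR; it only selects which UIDs a later header fetch would cover.

```python
def page_of_uids(uids: list, page: int, page_size: int, newest_first: bool = True) -> list:
    """Pick the UIDs for one page without fetching any per-message dates."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    # UIDs grow as messages are added to the mailbox, so sorting by UID
    # stands in for sorting by arrival order.
    ordered = sorted(uids, reverse=newest_first)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]
```

For example, `page_of_uids([5, 1, 9, 3, 7], page=2, page_size=2)` selects `[5, 3]`, so only two header fetches are needed regardless of mailbox size. Note that UID order reflects when a message was added to this mailbox, which can differ from its Date header (for instance after a move).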
- Use ternary operator in _parse_search_response (SIM108)
- Replace raise Exception with logging + failed_ids (TRY301)
- Add tests for list_mailboxes, search_emails, mark_emails_as_read, move_emails
- Update test_get_emails_stream to match optimized pagination behavior
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
a340e9a to
606255c
Codecov Report❌ Patch coverage is
Merge Conflict with PR #116
| Component | PR #116 | PR #117 | Status |
|---|---|---|---|
| Pagination strategy | UID-based | INTERNALDATE-based | ❌ Conflicting |
| Date fetching | Removes | Keeps | ❌ Conflicting |
| New tools | 5 new methods | Filter validation | ✅ Compatible |
| Caching | Not used | Added | |
💡 Recommended Solution
These PRs solve complementary problems and should both merge:
- PR #116 (feat: add email management tools): Optimizes pagination (O(n) → O(page_size))
- PR #117 (fix: optimize email search and require filters to prevent timeouts): Prevents expensive searches (filters required)
Suggestion: combine both optimizations as follows:
- Use PR #116's UID-based sorting (faster, no date fetching needed)
- Keep PR #117's filter validation (prevents expensive searches)
- Add PR #117's caching on top of PR #116's optimized code
📋 Files Affected
- mcp_email_server/emails/classic.py (both PRs modify ~20-40 lines)
- mcp_email_server/app.py (PR #116 adds tools, PR #117 unaffected)
✅ Next Steps
Either approach works:
- Option A: Merge PR #116 first, rebase PR #117 on top
- Option B: Custom merge combining both optimizations
- Option C: Coordinate with PR #116 author for synchronized merge
Both changes are valuable and compatible in intent; they just need a careful merge strategy.
- Cache search total to eliminate duplicate IMAP search (~50% performance improvement)
- Add validation requiring at least one filter (date, subject, from, to, seen, flagged, or answered)
- Prevents expensive 'ALL' email searches on large mailboxes that could time out
- Provides clear, helpful error message when no filters are specified
Fixes issue where list_emails_metadata() would hang indefinitely on large mailboxes when called without date filters.
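The validation described in this commit might look roughly like the sketch below. `require_filter` is a hypothetical standalone helper and the error text is invented for illustration; the filter names are the ones listed in the PR description.

```python
from datetime import datetime
from typing import Optional


def require_filter(
    *,
    since: Optional[datetime] = None,
    before: Optional[datetime] = None,
    subject: Optional[str] = None,
    from_address: Optional[str] = None,
    to_address: Optional[str] = None,
    seen: Optional[bool] = None,
    flagged: Optional[bool] = None,
    answered: Optional[bool] = None,
) -> None:
    """Raise ValueError unless at least one search filter was supplied."""
    filters = (since, before, subject, from_address, to_address, seen, flagged, answered)
    # Compare against None, not truthiness: seen=False is a real filter.
    if all(value is None for value in filters):
        raise ValueError(
            "At least one filter is required (for example since/before, "
            "subject, from_address, or seen) to avoid scanning the whole mailbox."
        )
```

Checking `is None` rather than truthiness matters here, because `seen=False` ("unread only") is a legitimate filter that a truthiness test would reject.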
- Mock _last_search_total to replace the removed get_email_count call
- Add 'before' filter to test_get_emails_with_mailbox to satisfy validation
- Remove unused mock_count assertions that no longer apply
- Split long filter validation line for readability
- Add test_get_emails_requires_filter to test ValueError when no filters provided
- Improves code coverage for the validation error message
- All 131 tests passing
…ance
- Clarify that combining filters is recommended best practice
- Text searches alone work but can be slow on massive mailboxes
- Suggest combining date ranges with optional text filters
- More helpful examples for users
…requirements
- Explains IMAP protocol limitations preventing 'first N emails' queries
- Documents why filters are required to prevent expensive mailbox scans
- Provides performance comparison of different search strategies
- Includes best practices and examples for efficient searches
- Migration guide for users who relied on unfiltered searches
- FAQ addressing common questions and concerns
This helps users and maintainers understand the architectural decision and provides clear guidance on optimized email searching.
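As an illustration of the "date range plus optional text filter" practice mentioned in these commits, the sketch below builds an IMAP SEARCH criteria string. `build_criteria` and `imap_date` are hypothetical helpers, not taken from the PR or its documentation; SINCE, BEFORE, SUBJECT, and FROM are standard IMAP search keys.

```python
from datetime import date
from typing import Optional

# Fixed English month names: IMAP dates must not depend on the locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day: date) -> str:
    """Format a date as dd-Mon-yyyy for IMAP SEARCH."""
    return f"{day.day:02d}-{_MONTHS[day.month - 1]}-{day.year:04d}"


def quote(value: str) -> str:
    """Wrap a value as an IMAP quoted string."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_criteria(
    since: Optional[date] = None,
    before: Optional[date] = None,
    subject: Optional[str] = None,
    from_address: Optional[str] = None,
) -> str:
    """Combine the supplied filters into one SEARCH criteria string."""
    parts = []
    if since is not None:
        parts += ["SINCE", imap_date(since)]
    if before is not None:
        parts += ["BEFORE", imap_date(before)]
    if subject is not None:
        parts += ["SUBJECT", quote(subject)]
    if from_address is not None:
        parts += ["FROM", quote(from_address)]
    if not parts:
        raise ValueError("refusing to build an unfiltered search")
    return " ".join(parts)
```

Multiple search keys in one command are ANDed by the server, so a date range narrows the set of messages that a text key has to match against.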
for more information, see https://pre-commit.ci
- Add EMAIL_SEARCH_PERFORMANCE.md to nav configuration
- Fix broken reference to ../README.md by using GitHub repo link instead
This resolves the mkdocs build failure in strict mode where undocumented files and relative links outside the docs directory are not allowed.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
… (search optimization + filter validation)
Combined optimizations:
- PR ai-zerolab#116: Add list_mailboxes, move_emails, mark_emails_as_read, search_emails tools
- PR ai-zerolab#116: UID-based pagination optimization (60s+ → <5s on large mailboxes)
- PR ai-zerolab#117: Filter validation (prevents accidental expensive searches)
- PR ai-zerolab#117: Search result caching (_last_search_total)
Conflict resolution: Merged parse_search_response logic with caching to avoid duplicate searches.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
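The search-total caching mentioned here can be sketched as follows. `CachedSearchClient` and its injected `search` callable are hypothetical stand-ins for the real client; only the `_last_search_total` attribute name comes from the PR.

```python
from typing import Callable, Optional


class CachedSearchClient:
    """Illustrative stand-in: runs one search and remembers its size."""

    def __init__(self, search: Callable) -> None:
        self._search = search  # hypothetical: criteria -> list of int UIDs
        self._last_search_total: Optional[int] = None

    def get_page(self, criteria: str, page: int, page_size: int) -> list:
        uids = sorted(self._search(criteria), reverse=True)
        # Remember the total so the caller need not search again to count.
        self._last_search_total = len(uids)
        start = (page - 1) * page_size
        return uids[start:start + page_size]

    def last_total(self) -> int:
        if self._last_search_total is None:
            raise RuntimeError("no search has run yet")
        return self._last_search_total
```

In this sketch the cached total describes only the most recent search on that client instance, so callers that share one instance across concurrent requests would need to account for that.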
741a750 to
b2c3298
✅ Merged with PR #116 - Combined Optimizations Complete
We've successfully resolved the merge conflict between this PR (#117) and PR #116 by combining both optimizations into a single, comprehensive solution.
🎯 What We Did
Merged both PRs into a unified branch:
Conflict Resolution:
✨ What We Got
Performance Optimizations:
New Email Management Tools:
🧪 Testing
Verified working:
📊 Impact
This combined approach gives users both fast pagination AND safe search operations. The filter validation prevents the timeout issues that would occur with large mailboxes, while the UID-based optimization makes pagination itself lightning-fast.
Branch: fix/optimize-email-search-and-require-filters
Summary
Addresses timeout issues in list_emails_metadata() by optimizing IMAP searches and preventing expensive 'ALL' email searches on large mailboxes.
Problem
- When list_emails_metadata() is called without date filters, it performs an expensive uid_search("ALL") operation
- The search runs twice: in get_emails_metadata_stream() and again in get_email_count()
Solution
1. Search Optimization (~50% faster)
- get_emails_metadata_stream() caches the search result
- get_email_count() is no longer called for the total; the cached _last_search_total is used instead
2. Safety Guard (Prevents hangs)
- get_emails_metadata() validates its arguments, requiring at least one filter
- Accepted filters: since, before, subject, from_address, to_address, seen, flagged, answered
- Raises ValueError with guidance if no filters provided
Changes Made
File: mcp_email_server/emails/classic.py
- EmailClient.__init__ (line 104-105): Add cache for search total: self._last_search_total = None
- get_emails_metadata_stream() (line 452-453): Cache the search result in self._last_search_total
- ClassicEmailHandler.get_emails_metadata() (line 949-957): Add the filter validation and drop the get_email_count() call
Benefits
Testing
Tested with the Galaxia email account: